Ensemble Clustering using Semidefinite Programming

نویسندگان

  • Vikas Singh
  • Lopamudra Mukherjee
  • Jiming Peng
  • Jinhui Xu
چکیده

We consider the ensemble clustering problem where the task is to 'aggregate' multiple clustering solutions into a single consolidated clustering that maximizes the shared information among given clustering solutions. We obtain several new results for this problem. First, we note that the notion of agreement under such circumstances can be better captured using an agreement measure based on a 2D string encoding rather than voting strategy based methods proposed in literature. Using this generalization, we first derive a nonlinear optimization model to maximize the new agreement measure. We then show that our optimization problem can be transformed into a strict 0-1 Semidefinite Program (SDP) via novel convexification techniques which can subsequently be relaxed to a polynomial time solvable SDP. Our experiments indicate improvements not only in terms of the proposed agreement measure but also the existing agreement measures based on voting strategies. We discuss evaluations on clustering and image segmentation databases.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Recurrent Neural Network Model for Solving Linear Semidefinite Programming

In this paper we solve a wide rang of Semidefinite Programming (SDP) Problem by using Recurrent Neural Networks (RNNs). SDP is an important numerical tool for analysis and synthesis in systems and control theory. First we reformulate the problem to a linear programming problem, second we reformulate it to a first order system of ordinary differential equations. Then a recurrent neural network...

متن کامل

The ensemble clustering with maximize diversity using evolutionary optimization algorithms

Data clustering is one of the main steps in data mining, which is responsible for exploring hidden patterns in non-tagged data. Due to the complexity of the problem and the weakness of the basic clustering methods, most studies today are guided by clustering ensemble methods. Diversity in primary results is one of the most important factors that can affect the quality of the final results. Also...

متن کامل

Advanced Optimization Laboratory Title: Approximating K-means-type clustering via semidefinite programming

One of the fundamental clustering problems is to assign n points into k clusters based on the minimal sum-of-squares(MSSC), which is known to be NP-hard. In this paper, by using matrix arguments, we first model MSSC as a so-called 0-1 semidefinite programming (SDP). We show that our 0-1 SDP model provides an unified framework for several clustering approaches such as normalized k-cut and spectr...

متن کامل

Robustness of SDPs for Partial Recovery of Clustering Subgaussian Mixtures

In this paper, we examine the robustness of a relax-and-round k-means clustering procedure, a method for clustering subgaussian mixtures using semidefinite programming first introduced in [MVW16]. We are interested in the robustness of the algorithm when there is an adversarial corruption of N points each through distance at most R0. We show that under such corruption this specific algorithm we...

متن کامل

Fast Low-Rank Semidefinite Programming for Embedding and Clustering

Many non-convex problems in machine learning such as embedding and clustering have been solved using convex semidefinite relaxations. These semidefinite programs (SDPs) are expensive to solve and are hence limited to run on very small data sets. In this paper we show how we can improve the quality and speed of solving a number of these problems by casting them as low-rank SDPs and then directly...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Advances in neural information processing systems

دوره 20  شماره 

صفحات  -

تاریخ انتشار 2007